    A practical approximation algorithm for solving massive instances of hybridization number for binary and nonbinary trees

    Reticulate events play an important role in determining evolutionary relationships. The problem of computing the minimum number of such events to explain discordance between two phylogenetic trees is a hard computational problem. Even for binary trees, exact solvers struggle to solve instances with reticulation number larger than 40-50. Here we present CycleKiller and NonbinaryCycleKiller, the first methods to produce solutions verifiably close to optimality for instances with hundreds or even thousands of reticulations. Using simulations, we demonstrate that these algorithms run quickly for large and difficult instances, producing solutions that are very close to optimality. As a spin-off from our simulations we also present TerminusEst, which is the fastest exact method currently available that can handle nonbinary trees: this is used to measure the accuracy of the NonbinaryCycleKiller algorithm. All three methods are based on extensions of previous theoretical work and are publicly available. We also apply our methods to real data.

    Cycle killer... qu'est-ce que c'est? On the comparative approximability of hybridization number and directed feedback vertex set

    We show that the problem of computing the hybridization number of two rooted binary phylogenetic trees on the same set of taxa X has a constant factor polynomial-time approximation if and only if the problem of computing a minimum-size feedback vertex set in a directed graph (DFVS) has a constant factor polynomial-time approximation. The latter problem, which asks for a minimum number of vertices to be removed from a directed graph to transform it into a directed acyclic graph, is one of the problems in Karp's seminal 1972 list of 21 NP-complete problems. However, despite considerable attention from the combinatorial optimization community, it remains to this day unknown whether a constant factor polynomial-time approximation exists for DFVS. Our result thus places the (in)approximability of hybridization number in a much broader complexity context, and as a consequence we obtain that hybridization number inherits inapproximability results from the problem Vertex Cover. On the positive side, we use results from the DFVS literature to give an O(log r log log r) approximation for hybridization number, where r is the value of an optimal solution to the hybridization number problem.
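
    To make the DFVS problem statement above concrete, here is a minimal sketch that finds a minimum-size feedback vertex set by exhaustive search. It is not the approximation algorithm from the abstract; the function names `is_acyclic` and `min_dfvs` are hypothetical, and the search is exponential in the number of vertices, so it only suits toy graphs.

```python
from itertools import combinations


def is_acyclic(nodes, edges):
    """Return True if the subgraph induced by `nodes` has no directed cycle.

    Uses Kahn's algorithm: repeatedly remove vertices of in-degree zero;
    the graph is acyclic exactly when every vertex gets removed.
    """
    indeg = {v: 0 for v in nodes}
    succ = {v: [] for v in nodes}
    for u, v in edges:
        # Ignore edges touching a vertex outside the induced subgraph.
        if u in indeg and v in indeg:
            succ[u].append(v)
            indeg[v] += 1
    stack = [v for v in nodes if indeg[v] == 0]
    removed = 0
    while stack:
        u = stack.pop()
        removed += 1
        for w in succ[u]:
            indeg[w] -= 1
            if indeg[w] == 0:
                stack.append(w)
    return removed == len(nodes)


def min_dfvs(nodes, edges):
    """Brute-force a minimum set of vertices whose removal leaves a DAG."""
    nodes = list(nodes)
    for k in range(len(nodes) + 1):
        for candidate in combinations(nodes, k):
            remaining = [v for v in nodes if v not in candidate]
            if is_acyclic(remaining, edges):
                return set(candidate)


# Illustrative graph (an assumption, not from the paper):
# two 2-cycles that share the vertex "b".
example_edges = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "b")]
solution = min_dfvs(["a", "b", "c"], example_edges)
```

    In this toy graph, removing the shared vertex "b" breaks both cycles at once, so the minimum feedback vertex set has size one.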

    Hybridization Number on Three Rooted Binary Trees is EPT

    Phylogenetic networks are leaf-labeled directed acyclic graphs that are used to describe nontreelike evolutionary histories and are thus a generalization of phylogenetic trees. The hybridization number of a phylogenetic network is the sum of all in-degrees minus the number of nodes plus one. The hybridization number problem takes as input a collection of rooted binary phylogenetic trees and asks to construct a phylogenetic network that contains an embedding of each of the input trees and has the smallest possible hybridization number. We present an algorithm for the hybridization number problem on three binary phylogenetic trees on n leaves that runs in time O(c^k poly(n)), with k the hybridization number of an optimal network and c some (astronomical) constant. For the case of two trees, an algorithm with running time O(3.18^k n) was proposed before, whereas an algorithm with running time O(c^k poly(n)), also called an EPT algorithm, had prior to this article remained elusive for more than two trees. The algorithm for two trees uses the close connection to acyclic agreement forests to achieve a linear exponent in the running time, while previous algorithms for more than two trees (explicitly or implicitly) relied on a brute force search through all possible underlying network topologies, leading to running times that are not O(c^k poly(n)) for any c. The connection to acyclic agreement forests is much weaker for more than two trees, so even given the right agreement forest, the reconstruction of the network poses major challenges. We prove novel structural results that allow us to reconstruct a network without having to guess the underlying topology. Our techniques generalize to more than three input trees with the exception of one key lemma that maps nodes in the network to tree nodes in order to minimize the amount of guessing involved in constructing the network. The main open problem therefore is to prove results that establish such a mapping for more than three trees.
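
    The definition of hybridization number given in this abstract (sum of all in-degrees, minus the number of nodes, plus one) can be sketched in a few lines. This is an illustration only, assuming the network is supplied as a list of directed (parent, child) edges with no repeated edges; the function name `hybridization_number` and the example networks are hypothetical.

```python
from collections import Counter


def hybridization_number(edges):
    """Sum of all in-degrees, minus the number of nodes, plus one."""
    indegree = Counter()
    nodes = set()
    for parent, child in edges:
        nodes.add(parent)
        nodes.add(child)
        indegree[child] += 1
    return sum(indegree.values()) - len(nodes) + 1


# A rooted tree has no reticulations: 2 edges, 3 nodes.
tree = [("root", "a"), ("root", "b")]

# One reticulation node "h" with two parents "u" and "v": 7 edges, 7 nodes.
network = [
    ("root", "u"), ("root", "v"),
    ("u", "h"), ("v", "h"),
    ("u", "a"), ("v", "b"), ("h", "c"),
]

tree_value = hybridization_number(tree)
network_value = hybridization_number(network)
```

    For a tree every non-root node has in-degree one, so the value is zero; each extra parent of a reticulation node raises it by one.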

    Approximation Algorithms for Nonbinary Agreement Forests
